Fix(target-allocator): accept prometheus receiver with only target_allocator block#5085
Fix(target-allocator): accept prometheus receiver with only target_allocator block#5085sashetov wants to merge 1 commit into
Conversation
|
|
|
I don't think you need to set You should change this in some our e2e test configs - if they pass, it'll confirm that your change is valid. Finally, you'll need to use an email address in your commit metadata that is linked to your GH account and sign the CLA with said account. |
23eb5c8 to
5581200
Compare
E2E Test Results 38 files 265 suites 2h 21m 21s ⏱️ Results for commit 9a58fd9. ♻️ This comment has been updated with latest results. |
|
@swiatekm I dropped the |
The TargetAllocator CRD tests are not passing, can you have a look? |
Signed-off-by: Alex Vassilevski <alexander@vassilevski.com>
5581200 to
9a58fd9
Compare
|
This should be fixed now. I added in the configmap builder, plus a unit test in |
swiatekm
left a comment
There was a problem hiding this comment.
Looks good to me overall, thank you for the contribution! I've got a few more comments about the e2e tests, but the actual implementation is ready.
Could you also search through the integration tests for blocks like config: {} and delete them? They're not necessary anymore, and this would provide further e2e testing for your change.
| prometheus: | ||
| endpoint: 0.0.0.0:9090 |
There was a problem hiding this comment.
nit: I don't think there's a reason to use a more complex exporter here. nopexporter should do just fine and be faster.
| kind: ClusterRole | ||
| metadata: | ||
| name: collector-prometheuscr-denyfs | ||
| name: collector-no-promconfig |
There was a problem hiding this comment.
It's safer to make the name the same as the ClusterRoleBinding, reduce the chance of collisions with other tests. Those have historically been a source of strange bugs.
Fix(target-allocator): accept prometheus receiver with only target_allocator block
Closes #2998
Summary
When an
OpenTelemetryCollectorCR has a prometheus receiver containing only atarget_allocator:block (noconfig:/scrape_configs:), reconciliation fails with"no prometheusConfig available as part of the configuration". this is wrong - that shape is a legitimate "TA-only" mode where the target allocator supplies scrape configuration externally (typically from discovered PrometheusCR objects, or a configmap mounted on the TA, outside the operator's surface).Note: a 2025-09-17 comment on the issue claimed this can't be reproduced, but I can't confirm that — the bug report shape (receiver with
target_allocator:only, noconfig:at all) still triggers the error onmain. unit-level repro below.What Changed
internal/manifests/targetallocator/adapters/config_to_prom_config.go-AddTAConfigToPromConfignow skips the scrape_configs strip whenconfig:is absent instead of erroring; whenconfig:is present the existing strip still runs.ValidateTargetAllocatorConfigreturnsnilwhenconfig:is absent (TA-only mode signal), instead of cascading the missing-config error fromGetScrapeConfigsFromPromConfig.internal/manifests/targetallocator/adapters/config_to_prom_config_test.go- the previously-asserts-bug sub-case ("missing config property"→ expects the old error) is removed from the negative test, and two positive sub-cases are added:"TA-only mode: no config block, only target_allocator set"and"TA-only mode: no config block, no target_allocator block". both assert no spuriousconfig:key is inserted on the way out.TestValidateTargetAllocatorConfiggets its"receiver config empty and PrometheusCR disabled"row flipped from "expect error" to "expect nil", plus a new explicit"no config block but target_allocator set, PrometheusCR disabled"row.tests/e2e-targetallocator/targetallocator-no-promconfig/- new chainsaw e2e: deploys anOpenTelemetryCollectorwithreceivers: { prometheus: {} }andtargetAllocator.enabled: true+prometheusCR.enabled: true, then asserts the collector and TA come up Ready and that the rendered collector configmap contains the operator-injectedtarget_allocatorblock underprometheus:with noconfig:key..chloggen/fix_target-allocator-only-config.yaml- bug_fix entry.Reproduction
a throwaway test (not committed) calls both functions with the exact config from the bug report. at HEAD^ both fail with the bug's error; at HEAD both pass:
The regression coverage that lands in this PR is in
config_to_prom_config_test.goand the new chainsaw test undertests/e2e-targetallocator/targetallocator-no-promconfig/; the throwaway above is just to demonstrate the before-state in a single test file at the same commit.Design
config: {}map so downstream code could keep writing into it. swiatekm pointed out the prometheus receiver doesn't require aconfig:block at all, so the synthesis is unnecessary. the current diff just guards thedelete(prometheusCfg, "scrape_configs")behind the presence check — if there's noconfig:, there are no scrape_configs to strip and nothing else inAddTAConfigToPromConfigtouchesprometheusCfg. the rendered collector config stays minimal (just the operator-injectedtarget_allocatorblock) and the chainsaw assert pins that shape.config:is absent and TA is enabled (this whole code path runs only when TA is enabled), there is genuinely nothing to validate locally. the TA owns its scrape-config source.target_allocator.endpointuser-supplied value is still overwritten by the operator (unchanged). that's existing behavior; this PR doesn't change endpoint semantics, only the validation/strip path.Tests
make precommitnot run end-to-end locally (envtest binaries download is slow on first run); the targetedgo testruns above cover the same packages. the new chainsaw test runs under the operator's e2e-targetallocator suite in CI.Notes
make ensure-update-is-noopshould be a no-op - verified by not editing anything underapis/.ValidateTargetAllocatorConfigchange out into a separate PR if maintainers prefer; both pieces are needed to fully resolve Target Allocator doesn't accept only target_allocator config in the collector config #2998 since both functions are called from the webhook (collector_webhook.go:330-337).